Pipeline Monitoring and Documentation

Dashboard

The Dashboard summarizes the status of all pipeline executions: past (completed and failed runs), in progress (running), and scheduled. Users can filter the dashboard view by any of these statuses.

For failed executions, the dashboard shows the reason for the failure, and the pipeline can be retried manually from the dashboard.

Data Quality

DataStori runs the following test cases on all data pipelines:

  • Uniqueness test: Check for duplicates in the primary key column(s).
  • Not null test: Check if the primary key columns contain null or blank values. If they do, fail the pipeline and raise an alert.
  • Data freshness test: Check if the pipeline is fetching data as per the schedule by comparing the actual and scheduled run times, and by counting the number of records fetched.
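
The three tests above can be sketched in plain Python. This is only an illustration of the idea, not DataStori's implementation: the function names, the list-of-dicts row format, and the `max_delay` and `min_records` thresholds are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def find_duplicate_keys(rows, key_columns):
    """Uniqueness test: return primary key values that occur more than once."""
    seen = set()
    duplicates = []
    for row in rows:
        key = tuple(row.get(col) for col in key_columns)
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates


def find_null_keys(rows, key_columns):
    """Not null test: return indexes of rows with a null or blank key value."""
    bad_rows = []
    for index, row in enumerate(rows):
        for col in key_columns:
            value = row.get(col)
            is_blank = isinstance(value, str) and value.strip() == ""
            if value is None or is_blank:
                bad_rows.append(index)
                break
    return bad_rows


def is_fresh(scheduled_at, actual_at, record_count,
             max_delay=timedelta(minutes=30), min_records=1):
    """Freshness test: the run started close to schedule and fetched records.

    max_delay and min_records are hypothetical thresholds.
    """
    on_time = (actual_at - scheduled_at) <= max_delay
    return on_time and record_count >= min_records


rows = [
    {"id": 1, "name": "a"},
    {"id": 1, "name": "b"},     # duplicate primary key
    {"id": None, "name": "c"},  # null primary key
    {"id": " ", "name": "d"},   # blank primary key
]
duplicate_keys = find_duplicate_keys(rows, ["id"])
null_key_rows = find_null_keys(rows, ["id"])
fresh = is_fresh(datetime(2024, 1, 1, 6, 0), datetime(2024, 1, 1, 6, 10), 120)
```

In this sketch a pipeline run would be failed, and an alert raised, whenever `null_key_rows` is non-empty.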

Schema Evolution

DataStori tracks schema evolution of the source data. Users can view the schema definitions in the Silver folder, from the starting schema through to the current schema.
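
As a rough illustration of what tracking schema evolution involves, the sketch below compares two schema versions and reports which columns were added, removed, or changed type. The way DataStori stores schema definitions is not described here, so the column-name-to-type dictionaries and the `diff_schemas` helper are hypothetical.

```python
def diff_schemas(old_schema, new_schema):
    """Compare two schema versions given as {column_name: type_name} dicts."""
    added = {col: typ for col, typ in new_schema.items() if col not in old_schema}
    removed = {col: typ for col, typ in old_schema.items() if col not in new_schema}
    changed = {
        col: (old_schema[col], new_schema[col])
        for col in old_schema
        if col in new_schema and old_schema[col] != new_schema[col]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical starting and current schemas for one source table.
starting_schema = {"id": "int", "amount": "int", "notes": "string"}
current_schema = {"id": "int", "amount": "double", "created_at": "timestamp"}

schema_diff = diff_schemas(starting_schema, current_schema)
```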

Error Notifications

To get notified of pipeline failures, go to the Notifications tab and add the email addresses of the people who need to be informed when any of your pipelines fails.

Data Rollback

The Delta file format supports versioning. If required, data can be rolled back to a selected previous version.

Info: If you need to roll back your data to a prior version, please write to contact@datastori.io.

Pipeline Documentation

DataStori automatically generates documentation for all configured pipelines. The documentation lists the data source and destinations, the pipeline schedule, and the schema location.

The documentation is available in the Documentation tab under Settings.